The Toronto Transit Commission (TTC) is Toronto’s public transportation system. Its network covers Toronto and extends into some surrounding municipalities of the Greater Toronto Area (GTA). The bus system is particularly extensive, with routes that reach almost every corner of the city.
The TTC operates subways, buses, and streetcars throughout the region. This analysis uses bus data only. The data set is skewed: it consists primarily of records of late buses, so on-time buses are poorly represented.
The data set contains data for the first six months of 2022, from January to June.
## Number of rows and columns: 27351 rows, 12 columns
## Number of unique routes (aka busses): 224
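Counts like the ones above can be reproduced with base R. The sketch below is illustrative only: it uses a small made-up data frame rather than the real `ttc` data, and the column name `route` is an assumption, since the actual column names are not shown here.

```r
# Toy stand-in for the `ttc` data frame; values and the `route` column name are hypothetical.
ttc_toy <- data.frame(
  route     = c("29", "29", "52", "36", "52"),
  min_delay = c(10, 25, 8, 15, 30)
)

dims <- dim(ttc_toy)                        # c(number of rows, number of columns)
n_routes <- length(unique(ttc_toy$route))   # number of distinct route identifiers

cat("Number of rows and columns:", dims[1], "rows,", dims[2], "columns\n")
cat("Number of unique routes:", n_routes, "\n")
```

Note that a route count is a count of distinct route identifiers, not of individual vehicles.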
Buses operate 24 hours a day, 7 days a week in Toronto. However, schedules vary with the day of the week and the time of year. For example, buses may run less frequently on weekends and holidays.
Are buses more likely to be late during rush hour, which typically runs from 7 a.m. to 9 a.m. and from 5 p.m. to 7 p.m.?
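One way to approach the rush-hour question is to flag each record as rush hour or not and compare mean delays between the two groups. The following is a minimal sketch on made-up data. The column names `hour` and `min_delay` are assumptions, and the cutoffs simply encode the rush-hour windows stated in the text.

```r
# Toy data; `hour` (0-23) is an assumed column, and all values are made up.
ttc_toy <- data.frame(
  hour      = c(6, 8, 8, 12, 17, 18, 22),
  min_delay = c(10, 20, 30, 12, 25, 15, 8)
)

# Rush hour as defined in the text: 7-9 a.m. and 5-7 p.m.
ttc_toy$rush <- (ttc_toy$hour >= 7 & ttc_toy$hour < 9) |
                (ttc_toy$hour >= 17 & ttc_toy$hour < 19)

# Mean delay in minutes for non-rush (FALSE) and rush (TRUE) records
rush_means <- tapply(ttc_toy$min_delay, ttc_toy$rush, mean)
print(rush_means)
```

Because the data set consists mostly of late buses, a comparison like this speaks to how long delays are in each period rather than to how likely a bus is to be late.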
Delays are longest on Sunday, averaging 25 minutes, and shortest on Tuesday, averaging only 17 minutes.
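A per-weekday average of this kind can be computed with `aggregate()`. The sketch below uses made-up values and assumed column names (`date`, `min_delay`); it shows the shape of the calculation, not the code behind the figures above.

```r
# Toy data; column names and values are hypothetical.
ttc_toy <- data.frame(
  date      = as.Date(c("2022-01-02", "2022-01-04", "2022-01-09", "2022-01-11")),
  min_delay = c(28, 10, 20, 14)
)

# "%u" gives the ISO weekday number: 1 = Monday, ..., 7 = Sunday
ttc_toy$wday <- as.integer(format(ttc_toy$date, "%u"))

# Mean delay in minutes for each weekday present in the data
by_day <- aggregate(min_delay ~ wday, data = ttc_toy, FUN = mean)
print(by_day)
```

Using the numeric weekday instead of `weekdays()` keeps the result independent of the system locale.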
# Model time_2
timemodel <- lm(min_delay ~ time_posi, data = ttc)
# Summary of the model
summary(timemodel)
##
## Call:
## lm(formula = min_delay ~ time_posi, data = ttc)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -21.82 -11.75  -9.00  -1.04 978.25
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5.485e+04  2.478e+04   2.214   0.0268 *
## time_posi   -3.222e-05  1.456e-05  -2.213   0.0269 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 49.63 on 27349 degrees of freedom
## Multiple R-squared: 0.000179, Adjusted R-squared: 0.0001425
## F-statistic: 4.897 on 1 and 27349 DF, p-value: 0.02691
report::report(timemodel)
## We fitted a linear model (estimated using OLS) to predict min_delay with
## time_posi (formula: min_delay ~ time_posi). The model explains a statistically
## significant and very weak proportion of variance (R2 = 1.79e-04, F(1, 27349) =
## 4.90, p = 0.027, adj. R2 = 1.42e-04). The model's intercept, corresponding to
## time_posi = [?], is at 54851.09 (95% CI [6287.75, 1.03e+05], t(27349) = 2.21, p
## = 0.027). Within this model:
##
## - The effect of time posi is statistically significant and negative (beta =
## -3.22e-05, 95% CI [-6.08e-05, -3.68e-06], t(27349) = -2.21, p = 0.027; Std.
## beta = -6.49e-07, 95% CI [-1.22e-06, -7.42e-08])
##
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald t-distribution approximation.
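The slope is hard to read in raw form, so it helps to rescale it. The arithmetic below assumes that `time_posi` is a POSIXct value measured in seconds; the output does not confirm this, so treat the per-hour and per-day figures as conditional on that assumption. The slope and R-squared values are copied from the summary above.

```r
# Slope from the model summary: minutes of delay per unit of time_posi
slope <- -3.222e-05

# Assumption: one unit of time_posi is one second (as for POSIXct values)
slope_per_hour <- slope * 3600
slope_per_day  <- slope * 86400

# R-squared from the summary, expressed as a percentage of variance explained
r2_percent <- 0.000179 * 100

cat(sprintf("Change per hour: %.3f minutes\n", slope_per_hour))
cat(sprintf("Change per day:  %.2f minutes\n", slope_per_day))
cat(sprintf("Variance explained: %.4f%%\n", r2_percent))
```

Under that assumption the effect is statistically significant but small, and the model accounts for well under 1% of the variation in delay, which matches the "very weak" wording in the report output.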